74 research outputs found

Basic Filters for Convolutional Neural Networks Applied to Music: Training or Design?

Author: Doerfler Monika
Grill Thomas
Bammer Roswitha
Flexer Arthur
Publication date: 05/07/2017

When convolutional neural networks are used to tackle learning problems based on music or, more generally, time series data, raw one-dimensional data are commonly pre-processed to obtain spectrogram or mel-spectrogram coefficients, which are then used as input to the actual neural network. In this contribution, we investigate, both theoretically and experimentally, the influence of this pre-processing step on the network's performance, and ask whether replacing it with adaptive or learned filters applied directly to the raw data can improve learning success. The theoretical results show that approximately reproducing mel-spectrogram coefficients by applying adaptive filters and subsequent time-averaging is in principle possible. We also conducted extensive experimental work on the task of singing voice detection in music. The results of these experiments show that, for classification based on convolutional neural networks, the features obtained from adaptive filter banks followed by time-averaging perform better than the canonical Fourier-transform-based mel-spectrogram coefficients. Alternative adaptive approaches with center frequencies or time-averaging lengths learned from training data perform equally well. Comment: Completely revised version; 21 pages, 4 figures

arXiv.org e-Print Archive

Dryad Digital Repository (Duke University)

On Computing Morphological Similarity of Audio Signals

Author: Arthur Flexer
Martin Gasser
Thomas Grill

(Abstract to follow)

ZENODO

An investigation of likelihood normalization for robust ASR

Author: Flexer Arthur
Gkiokas Aggelos
Schnitzer Dominik
Vincent Emmanuel
Publication venue: HAL CCSD
Publication date: 14/09/2014

Noise-robust automatic speech recognition (ASR) systems rely on feature and/or model compensation. Existing compensation techniques typically operate on the features or on the parameters of the acoustic models themselves. By contrast, a number of normalization techniques have been defined in the field of speaker verification that operate on the resulting log-likelihood scores. In this paper, we provide a theoretical motivation for likelihood normalization due to the so-called "hubness" phenomenon and we evaluate the benefit of several normalization techniques on ASR accuracy for the 2nd CHiME Challenge task. We show that symmetric normalization (S-norm) reduces the relative error rate by 43% alone and by 10% after feature and model compensation.

INRIA a CCSD electronic archive server

A Hybrid Approach to Music Playlist Continuation Based on Playlist-Song Membership

Author: Bertin-Mahieux Thierry
Cunningham Sally Jo
Flexer Arthur
Hamid
Jansson Andreas
Lee Jin Ha
Logan Beth
McFee Brian
Pohle Tim
Team Theano Development
Vall Andreu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/05/2018

Automated music playlist continuation is a common task of music recommender systems, which generally consists of providing a fitting extension to a given playlist. Collaborative filtering models, which extract abstract patterns from curated music playlists, tend to provide better playlist continuations than content-based approaches. However, pure collaborative filtering models have at least one of the following limitations: (1) they can only extend playlists profiled at training time; (2) they misrepresent songs that occur in very few playlists. We introduce a novel hybrid playlist continuation model based on what we call "playlist-song membership", that is, whether a given playlist and a given song fit together. The proposed model regards any playlist-song pair exclusively in terms of feature vectors. In light of this information, and after having been trained on a collection of labeled playlist-song pairs, the proposed model decides whether a playlist-song pair fits together or not. Experimental results on two datasets of curated music playlists show that the proposed playlist continuation model is comparable to a state-of-the-art collaborative filtering model in the ideal situation of extending playlists profiled at training time and where songs occurred frequently in training playlists. In contrast to the collaborative filtering model, and as a result of its general understanding of the playlist-song pairs in terms of feature vectors, the proposed model is additionally able to (1) extend non-profiled playlists and (2) recommend songs that occurred seldom or never in training playlists.

arXiv.org e-Print Archive

Crossref

The neglected user in music information retrieval research

Author: A Flexer
Arthur Flexer
D Johnson
DD Lee
G Linden
GR Xue
J Callan
J Futrelle
J Urbano
JJ Aucouturier
JJ Rocchio
Julián Urbano
K Järvelin
L Azzopardi
Markus Schedl
R Baeza-Yates
S Mizzaro
T Kohonen
Y Koren
Publication venue: 'Springer Science and Business Media LLC'

Crossref